(54) Load distribution among servers in a TCP/IP network



(57) Methods and apparatus for hosting a network 
service on a cluster of servers, each including a primary 
and a secondary Internet Protocol (IP) address. A com- 
mon cluster address is assigned as the secondary ad- 
dress to each of the servers in the cluster. The cluster 
address may be assigned in UNIX-based servers using 
an ifconfig alias option, and may be a ghost IP address 
that is not used as a primary address by any server in 
the cluster. Client requests directed to the cluster ad- 
dress are dispatched such that only one of the servers 
of the cluster responds to a given client request. The 
dispatching may use a routing-based technique, in
which all client requests directed to the cluster address
are routed to a dispatcher connected to the local net- 
work of the server cluster. The dispatcher then applies 
a hash function to the client IP address in order to select 
one of the servers to process the request. The dispatch- 
ing may alternatively use a broadcast-based technique, 
in which a router broadcasts client requests having the 
cluster address to all of the servers of the cluster over 
a local network. The servers then each provide a filtering 
routine, which may involve comparing a server identifier 
with a hash value generated from a client address, in 
order to ensure that only one server responds to each 
request broadcast by the router. 
The present invention relates generally to data
communication networks such as the Internet, and more
particularly to techniques for hosting network services
on a cluster of servers used to deliver data over a
network in response to client requests.

Background of the Invention

With the explosive growth of the World Wide Web,
many popular Internet web sites are heavily loaded with
client requests. For example, it has been reported in S.
L. Garfinkel, "The Wizard of Netscape," Webserver
Magazine, July/August 1996, pp. 59-63, that home pages
of Netscape Communications receive more than 80
million client requests or "hits" per day. A single server
hosting a service is usually not sufficient to handle this
type of aggressive growth. As a result, clients may
experience slow response times and may be unable to
access certain web sites. Upgrading the servers to more
powerful machines may not always be cost-effective.
Another common approach involves deploying a set of
machines, also known as a cluster, and configuring the
machines to work together to host a single service. Such
a server cluster should preferably publicize only one
server name for the entire cluster so that any
configuration change inside the cluster does not affect
client applications. The World Wide Web and other
portions of the Internet utilize an application-level
protocol known as the Hypertext Transfer Protocol (HTTP),
which is based on a client/server architecture. The HTTP
protocol is described in greater detail in "Hypertext
Transfer Protocol -- HTTP/1.0," Network Working Group, May
1996, <http://www.ics.uci.edu/pub/ietf/http>, which is
incorporated by reference herein.

FIG. 1 illustrates an exemplary client/server
architecture suitable for implementing HTTP-based network
services. A client 12 generates an HTTP request for a
particular service, such as a request for information
associated with a particular web site, and a Transmission
Control Protocol/Internet Protocol (TCP/IP) connection is
then established between the client 12 and a server 14
hosting the service. The client request is delivered to
the server 14 in this example via a TCP/IP connection over
a first network 16, a router 18 and a second network 20.
The first network 16 may be a wide area communication
network such as the Internet, while the second network 20
may be an Ethernet or other type of local area network
(LAN) interconnecting server 14 with other servers in a
server cluster. The router 18, also referred to as a
gateway, performs a relaying function between the first
and second networks which is transparent to the client 12.



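The client request is generated by a web browser or
other application-layer program operating in an
application layer 22-1 of the client 12, and is handled by
a corresponding program operating in an application layer
22-2 of the server 14. The requested network service may
be designated by a Uniform Resource Locator (URL) which
includes a domain name identifying the server 14 or a
corresponding server cluster. The client 12 initiates the
TCP/IP connection by requesting a local or remote Domain
Name Service (DNS) to map the server domain name to an IP
address. The TCP and IP packet routing functions in client
12 and server 14 are provided in respective TCP layers
24-1, 24-2 and IP layers 26-1, 26-2. The TCP and IP layers
are generally associated with the transport and network
layers, respectively, of the Open Systems Interconnection
(OSI) model. The TCP layers 24-1, 24-2 process TCP
packets of the client request and server reply.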

The TCP packets each include a TCP header identifying a
port number of the TCP connection between the client
12 and server 14. The IP layers 26-1, 26-2 process IP
packets formed from the TCP packets of the TCP layers.
The IP packets each include an IP header identifying an
IP address. The client 12 uses the IP address resulting
from the DNS lookup as a destination address in
the IP packet headers of client request packets. The IP
address together with the TCP port number provide the
complete transport address for the HTTP server process.
The client 12 and server 14 also include data link and
physical layers 28-1, 28-2 for performing framing and
other operations to configure client request or reply
packets for transmission over the networks 16 and 20. The
router 18 includes data link and physical layers, and
routes packets based on IP addresses. The server 14
responds to a given client request by supplying the
requested information over the established TCP/IP
connection in a number of reply packets. The TCP/IP
connection is then closed.

There are many known techniques for distributing
client requests to a cluster of servers. FIGS. 2
and 3 illustrate server-side single-IP-address image
approaches which present a single IP address to the
clients. An example of this approach is the TCP router
approach described in D. M. Dias et al., "A Scalable and
Highly Available Web Server," Proceedings of COMPCON '96,
pp. 85-92, 1996, which is incorporated by reference
herein. FIG. 2 illustrates the TCP router approach in
which a client 12 establishes a TCP/IP connection over
Internet 30 with a server-side router 32 having an IP
address RA. The router 32 is connected to a server cluster
34 including N servers 14-i, i = 1, 2, ... N, having
respective IP addresses S1, S2, ... SN. Each server of the
cluster 34 generally provides access to the same set of
contents, and the contents may be replicated on a local
disk of each server, shared on a network file system, or
served by a distributed file system.

The single-address image is achieved by publiciz- 
ing the address RA of the server-side router 32 to the 
clients via the DNS. The client 12 therefore uses RA as 
a destination IP address in its request. The request is 
directed to the router 32, which then dispatches the re- 
quest to a selected server 14-k of server cluster 34 
based on load characteristics, as indicated by the 
dashed line connecting client 12 to server 14-k via router
32. The router 32 performs this dispatching function by 
changing the destination IP address of each incoming 
IP packet of a given client request from the router ad-
dress RA to the address Sk of selected server 14-k. The
selected server 14-k responds to the client request by 
sending reply packets over the established TCP/IP con- 
nection, as indicated by the dashed line connecting 
server 14-k to client 12. In order to make the TCP/IP 
connection appear seamless to the client 12, the select-
ed server 14-k changes the source IP address in its reply
packets from its address Sk to the router address RA. 
The advantages of this approach are that it does not in- 
crease the number of TCP connections, and it is totally 
transparent to the clients. However, since the above- 
noted source IP address change is performed at the IP 
layer in a given server, the kernel code of every server 
in the cluster has to be modified to implement this mech-
anism. A proposed hybrid of the DNS approach and the
TCP router approach, in which a DNS server selects one 
of several clusters of servers using a round-robin tech- 
nique, suffers from the same problem. 

FIG. 3 illustrates a server-side single-address im- 
age approach known as network address translation, as 
described in greater detail in E. Anderson, D. Patterson 
and E. Brewer, "The Magicrouter, an Application of Fast 
Packet Interposing," Symposium on Operating Systems 
Design and Implementation, OSDI, 1996,
<http://www.cs.berkeley.edu/~eanders/magicrouter/osdi96-mr-submission.ps>,
and Cisco Local Director,
<http://www.cisco.com/warp/public/751/lodir/index.html>, which are
incorporated by reference herein. As in the TCP router 
approach of FIG. 2, the client 12 uses the router address 
RA as a destination IP address in a client request, and 
the router 32 dispatches the request to a selected server 
14-k by changing the destination IP address of each in- 
coming request packet from the router address RA to 
the address Sk of selected server 14-k. However, in the 
network address translation approach, the source IP ad- 
dresses in the reply packets from the selected server 
14-k are changed not by server 14-k as in FIG. 2, but
are instead changed by the router 32. The reply packet
flow indicated by a dashed line in FIG. 3 thus passes
from server 14-k to client 12 via router 32. 

Compared to the TCP router approach of FIG. 2,
network address translation has the advantage of server
transparency. That is, no specific changes to the kernel
code of the servers are required to implement the
technique. However, both the TCP router and network
address translation approaches require that the
destination address in a request packet header be changed
to a server address so that the server can accept the
request. These approaches also require that the source
address in a reply packet header be changed to the router
address so that the client can accept the reply. These
changes introduce additional processing overhead and
unduly complicate the packet delivery process. In
addition, because of the address changes, the
above-described single-address image approaches may not be
suitable for use with protocols that utilize IP addresses
within an application, such as that described in K.
Egevang and P. Francis, "The IP Network Address
Translator," Network Working Group, RFC 1631,
<http://www.safety.net/rfc1631.txt>, which is incorporated
by reference herein. Furthermore, in both the TCP router
and network address translation approaches, the router
32 needs to store an IP address mapping for every IP
connection. Upon receiving an incoming packet associated
with an existing TCP connection, the router has to
search through all of the mappings to determine which
server the packet should be forwarded to. The router
itself may therefore become a bottleneck under heavy
load conditions, necessitating the use of a more complex
hardware design, as in the above-cited Cisco Local
Director.
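
The per-connection state described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class name MappingRouter, the (client IP, client port) connection key and the round-robin choice of server are hypothetical and are not taken from the cited TCP router or network address translation systems. The sketch uses a dictionary rather than a linear search, so it shows the growth of the mapping table rather than the search cost.

```python
from typing import Dict, List, Tuple

# Hypothetical connection key for this sketch: (client IP, client TCP port).
ConnKey = Tuple[str, int]


class MappingRouter:
    """Toy model of a router that keeps one address mapping per connection."""

    def __init__(self, router_addr: str, servers: List[str]) -> None:
        self.router_addr = router_addr          # the publicized address RA
        self.servers = servers                  # server addresses S1 ... SN
        self.mappings: Dict[ConnKey, str] = {}  # one entry per live connection
        self._next = 0

    def dispatch(self, client_ip: str, client_port: int) -> str:
        """Return the server address that replaces RA as the destination."""
        key = (client_ip, client_port)
        if key not in self.mappings:
            # New connection: round-robin stands in for "load characteristics".
            self.mappings[key] = self.servers[self._next % len(self.servers)]
            self._next += 1
        return self.mappings[key]

    def close(self, client_ip: str, client_port: int) -> None:
        """Drop the mapping when the connection is closed."""
        self.mappings.pop((client_ip, client_port), None)
```

Every open connection occupies one table entry, and every incoming packet needs a lookup before it can be forwarded, which is the state that the hash-based dispatching described later avoids.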

It is therefore apparent that a need exists for
improved techniques for hosting a network service on a
cluster of servers while presenting a single-address
image to the clients, without the problems associated with
the above-described conventional approaches.


Summary of the Invention 

The present invention provides methods and
apparatus for hosting a network service on a cluster of
servers. All of the servers in a server cluster configured
in accordance with the invention may be designated by a
single cluster address which is assigned as a secondary
address to each server. All client requests for a web site
or other network service associated with the cluster
address are sent to the server cluster, and a dispatching
mechanism is used to ensure that each client request is
processed by only one server in the cluster. The
dispatching may be configured to operate without
increasing the number of TCP/IP connections required for
each client request. The invention evenly distributes the
client request load among the various servers of the
cluster, masks the failure of any server or servers of the
cluster by distributing client requests to the remaining
servers without bringing down the service, and permits
additional servers to be added to the cluster without
bringing down the service. Although well-suited for use in
hosting web site services, the techniques of the present
invention may also be used to support a wide variety of
other server applications.

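In an exemplary embodiment of the invention, a
network service is hosted on a server cluster. A router is
coupled to a local network of the server cluster and is
also coupled via the Internet to a client. Client requests
directed to the cluster address are dispatched such
that each of the requests is processed by only one of
the servers. In a broadcast-based embodiment, each
server may apply a hash function to the client address and
compare the result to a server identifier to determine
whether that server should process the client request.

The techniques of the present invention provide fast
dispatching, and do not require that the source
address in a reply packet header be changed to the router
address. These and other features and advantages of
the present invention will become more apparent from
the accompanying drawings and the following detailed
description.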



Brief Description of the Drawings



FIG. 1 is a block diagram illustrating a conventional
client/server architecture;
FIG. 2 illustrates a prior art TCP router technique
for hosting a network service on a cluster of servers;
FIG. 3 illustrates a prior art network address
translation technique for hosting a network service on a
cluster of servers;
FIG. 4 illustrates a technique for hosting a network
service on a cluster of servers using routing-based
dispatching in accordance with an exemplary
embodiment of the invention; and
FIG. 5 illustrates a technique for hosting a network
service on a cluster of servers using broadcast-based
dispatching in accordance with another
exemplary embodiment of the invention.



Detailed Description of the Invention



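The present invention will be illustrated below in
conjunction with client/server connections using the
Transmission Control Protocol/Internet Protocol
(TCP/IP) standard. It should be understood, however,
that the invention is not limited to use with any
particular type of network or network communication
protocol. The term "server cluster" as used herein refers
to a group of servers configured to support a network
service. The term "ghost IP address" refers to a
type of cluster address in the form of an IP address
which is not used as a primary address by any server of a
given cluster. The term "network service" is intended to
include web sites, Internet sites and any other data
transfer mechanism. A client request may comprise one or
more request packets, depending on the nature of the
request.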

The present invention provides an improved
single-address image approach to distributing client
requests to servers of a server cluster. In a preferred
embodiment, the invention allows all servers of a server
cluster to share a single common IP address as a secondary
address. The secondary address is also referred to
herein as a cluster address, and may be established
using an ifconfig alias option available on most
UNIX-based systems, or similar techniques available on
other systems. The cluster address may be publicized to
clients using the above-noted Domain Name Service
(DNS) which translates domain names associated with
Uniform Resource Locators (URLs) to IP addresses. All
client requests to be directed to a service hosted by the
server cluster are sent to the single cluster address, and
dispatched to a selected one of the servers using
routing-based or broadcast-based dispatching techniques
to be described in greater detail below. Once a server
is selected, future request packets associated with the
same client request may be directed to the same server.
All other communications within the server cluster may
utilize primary IP addresses of the servers.

The above-noted ifconfig alias option is typically
used to allow a single server to serve more than one
domain name. For example, the ifconfig alias option
allows a single server to attach multiple IP addresses, and
thus multiple domain names, to a single network
interface, as described in "Two Servers, One Interface,"
<http://www.thesphere.com/~dlp/TwoServers/>, which is
incorporated by reference herein. Client requests
directed to any of the multiple domain names can then be
serviced by the same server. The server determines
which domain name a given request is associated with
by examining the destination address in the request
packet. The present invention utilizes the ifconfig alias
option to allow two servers to share the same IP
address. Normally, two servers cannot share the same IP
address because such an arrangement would cause
any packet destined for the shared address to be
accepted and responded to by both servers, confusing the
client and possibly leading to a connection reset.
Therefore, before a server is permitted to attach a new IP
address to its network interface, a check may be made to
ensure that no other server on the same local area
network (LAN) is using that IP address. If a duplicate
address is found, both servers are informed and warnings
are issued. The routing-based or broadcast-based
dispatching of the present invention ensures that every
packet is processed by only one server of the cluster,
such that the above-noted warnings do not create a
problem.

An alternative technique for assigning a secondary
address to a given server of a server cluster in
accordance with the invention involves configuring the
given server to include multiple network interface cards
such that a different address can be assigned to each of
the network interface cards. For example, in a UNIX-based
system, conventional ifconfig commands may be used,
without the above-described alias option, to assign a
primary IP address to one of the network interface cards
and a secondary IP address to another of the network
interface cards. The secondary IP address is also
assigned as a secondary IP address to the remaining
servers in the cluster, and used as a cluster address for
directing client requests to the cluster.

The exemplary embodiments of the present
invention to be described below utilize dispatching
techniques in which servers are selected based on a hash
value of the client IP address. The hash value may be
generated by applying a hash function to the client IP
address, or by applying another suitable function to
generate a hash value from the client IP address. For
example, given N servers and a packet from a client having
a client address CA, a dispatching function in accordance
with the invention may compute a hash value k as
CA mod (N-1) and select server k to process the packet.
This ensures that all request or reply packets of the same
TCP/IP connection are directed to the same server in the
server cluster. A suitable hash function may be
determined by analyzing a distribution of client IP
addresses in actual access logs associated with the
servers such that client requests are approximately evenly
distributed to all servers. When a server in the cluster
fails, the subset of clients assigned to that server will
not be able to connect to it. The present invention
addresses this potential problem by dynamically modifying
the dispatching function upon detection of a server
failure. If the hash value of a given client IP address
maps to the failed server, the client IP address is
rehashed to map to a non-failed server, and the
connections of the remaining clients are not affected by
the failure.
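
The hash-based selection and the rehash on failure can be sketched in Python as follows. This is a minimal sketch, not the dispatching function of the invention: the function names are hypothetical, servers are numbered from 0, and the modulus is taken over the number of servers so that every index is reachable, whereas the text writes the hash as CA mod (N-1). The choice of second hash for failed servers is likewise an assumption.

```python
import ipaddress
from typing import AbstractSet


def client_hash(client_ip: str) -> int:
    """Treat the IPv4 client address (CA in the text) as a 32-bit integer."""
    return int(ipaddress.IPv4Address(client_ip))


def select_server(client_ip: str, num_servers: int,
                  failed: AbstractSet[int] = frozenset()) -> int:
    """Return a 0-based server index for the given client address.

    Assumption of this sketch: k = CA mod num_servers. If server k has
    failed, the address is rehashed over the live servers only, so that
    clients mapped to live servers keep their original assignment.
    """
    live = [i for i in range(num_servers) if i not in failed]
    if not live:
        raise RuntimeError("no live servers in the cluster")
    ca = client_hash(client_ip)
    k = ca % num_servers
    if k in failed:
        # Only clients whose first choice failed are remapped.
        k = live[ca % len(live)]
    return k
```

Because the result depends only on the client address and the set of failed servers, every packet of a connection reaches the same server without any per-connection table.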

FIG. 4 illustrates a routing-based dispatching
technique in accordance with the present invention. Solid
lines indicate network connections, while dashed lines
show the path of an exemplary client request and the
corresponding reply. A client 52 sends a client request
to a server cluster 54 including N servers 54-i, i = 1, 2, ...
N having IP addresses S1, S2, ... SN and interconnected
by an Ethernet or other type of LAN 56. The client
request is formulated in accordance with the
above-described HTTP protocol, and may include a URL with a
domain name associated with a web site or other
network service hosted by the server cluster 54. The client
accesses a DNS to determine an IP address for the
domain name of the service, and then uses the IP address
to establish a TCP/IP connection for communicating
with one of the servers 54-i of the server cluster 54. In
accordance with the invention, a "ghost" IP address is
publicized to the DNS as a cluster address for the server
cluster 54. The ghost IP address is selected such that
none of the servers 54-i of cluster 54 has that IP address
as its primary address. Therefore, any request packets
directed to the ghost IP address are associated with
client requests for the service of the single-address image
cluster 54. The use of the ghost IP address thus
distinguishes network service activities from activities
directed to the primary server addresses, and avoids
interference with these primary address activities.

The request is directed via Internet 60 to a router 62
having an IP address RA. The router 62 includes a routing
table having an entry or record directing any incoming
request packets having the ghost IP address to a
dispatcher 64 connected to the LAN 56. The dispatcher
64 includes an operating system configured to run in a
router mode. In alternative embodiments, the functions of
the dispatcher 64 could be incorporated into the router
62. Each of the servers 54-i of the cluster 54 utilizes
the above-noted ifconfig alias option to set the ghost IP
address as its secondary address. As noted above, this
technique of setting a secondary address for each of the
servers does not require any modification to the kernel
code running on the servers. In alternative embodiments,
one or more of the servers may be configured to include
multiple network interface cards such that a different
address can be assigned to each of the network interface
cards of a given server using a similar technique.

The router 62 routes any packets having the ghost
IP address to the dispatcher 64 in accordance with the
above-noted routing table entry. The dispatcher 64
then applies a hash function to the client IP address in
a given request packet to determine which of the servers
54-i the given packet should be routed to. In the example
illustrated in FIG. 4, the dispatcher 64 applies a hash
function to the IP address of client 52 and determines
that the corresponding request packet should be routed
to server 54-2. The dispatcher 64 routes the request
packet to the server 54-2 over LAN 56, as indicated by the
dashed line, using the address S2 of server 54-2 to
distinguish it from the other servers of cluster 54. After
the network interface of server 54-2 accepts the packet,
all higher level processing may be based on the ghost IP
address because that is the destination address in the
packet IP header, and possibly in the application-layer
packet contents. After processing the request, the server
54-2 replies directly to the client 52 via router 62 over
the established TCP/IP connection, using the ghost IP
address, and without passing through the dispatcher 64.

It should be noted that when a request packet arrives
at the network interface of dispatcher 64 and is routed
back onto the same network interface for delivery to one
of the servers 54-i over LAN 56, it may cause an ICMP
host redirect message to be generated, as described in
greater detail in W. R. Stevens, TCP/IP Illustrated,
Vol. 1, which is incorporated by reference herein. This
effect is undesirable in the routing-based dispatching of
FIG. 4, and it may therefore be necessary to suppress the
ICMP host redirect message for the ghost IP address by,
for example, removing or altering the corresponding
operating system code in the dispatcher. In alternative
embodiments in which the dispatching function is
implemented within the router 62, the ICMP host redirect
message is not generated and therefore need not be
suppressed. Another potential problem may arise when a
reply packet is sent back to the client 52 from the
selected server 54-2 with the ghost IP address. This may
cause the router 62 to associate, in its Address
Resolution Protocol (ARP) cache, the ghost IP address with
the LAN address of the selected server. The operation of
the ARP cache is described in greater detail in W. R.
Stevens, TCP/IP Illustrated, Vol. 1, Chs. 4 and 5, pp.
53-68, which is incorporated by reference herein. The
embodiment of FIG. 4 avoids this problem by routing the
request packets to the dispatcher 64, and then
dispatching based on the server primary IP address, such
that the router ARP cache is not used.
FIG. 5 illustrates a broadcast-based dispatching
technique in accordance with the present invention.
Again, solid lines indicate network connections, while
dashed lines show the path of an exemplary client
request and the corresponding reply. As in the
routing-based embodiment of FIG. 4, a client 52 sends a
client request to server cluster 54 including N servers
54-i, i = 1, 2, ... N connected to LAN 56 and having IP
addresses S1, S2, ... SN. The client uses the
above-described ghost IP address as a cluster address for
directing its request to the server cluster 54. The
request is directed over Internet 60 to a router 70 having
an IP address RA. The router 70 broadcasts any incoming
request packets having the ghost IP address to the LAN 56
interconnecting the servers 54-i of the server cluster 54,
such that the request packet is received by each of the
servers 54-i. Each of the servers 54-i of the cluster 54
implements a filtering routine in order to ensure that
only one of the servers 54-i processes a given client
request. The filtering routine may be implemented in a
device driver of each of the servers 54-i. In an exemplary
implementation, each of the servers 54-i is assigned a
unique identification (ID) number. The filtering routine
of a given server 54-i computes a hash value of the client
IP address and compares it to the ID number of the given
server. If the hash value and the ID number do not match,
the filtering routine of the given server rejects the
packet. If the hash value and the ID number do match, the
given server accepts and processes the packet as if it had
received the packet through a conventional IP routing
mechanism. In
the illustrative example of FIG. 5, a packet associated
with a request from client 52 is broadcast by the router 70
to each of the servers 54-i of the server cluster 54 over
the LAN 56 as previously noted. The filtering routine of
server 54-2 generates a hash value of the client IP
address which matches the unique ID number associated
with server 54-2, and server 54-2 therefore accepts and
processes the packet. The filtering routines of the N-1
other servers 54-i each indicate no match between the
client IP address and the corresponding server ID
number, and therefore discard the broadcast packet.
The reply packets are sent back to the client 52 via
router 70, as indicated by the dashed lines, using the
ghost IP address.
The broadcast-based dispatching technique of FIG. 
5 may be implemented using a permanent ARP entry 
within the router 70, to associate the ghost IP address 
with the Ethernet or other local network broadcast ad- 
dress associated with LAN 56 of the cluster 54. A po- 
tential problem is that any reply packet from a selected 
server appears to be coming from the ghost IP address, 
and may therefore cause the router 70 to overwrite the 
entry in its ARP cache such that the ghost IP address is 
associated with the LAN address of the selected server.
This potential problem may be addressed by setting up
a routing table entry in the router 70 to direct all packets
having a ghost IP destination address to a second ghost
IP address which is a legal subnet address in the LAN
56 of the server cluster 54 but is not used by any server.
In addition, an entry is inserted in the ARP cache of the
router 70 to associate the second ghost IP address with 
the broadcast address of the LAN 56. When the router 
70 routes a packet to the second ghost IP address, it 
will then actually broadcast the packet to each of the 
servers 54-i of the cluster 54. Since no reply packet is 
sent from the second ghost IP address, the correspond- 
ing entry of the router ARP cache will remain un- 
changed. Another potential problem is that some oper- 
ating systems, such as the NetBSD operating system, 
do not allow a TCP packet to be processed if it is re- 
ceived from a broadcast address. This potential problem 
may be avoided by a suitable modification to the broad- 
cast address in the LAN packet header attached to the 
packet. 

The routing-based and broadcast-based dispatch- 
ing techniques described in conjunction with FIGS. 4 
and 5 above have been implemented on a cluster of Sun 
SPARC workstations. The NetBSD operating system, 
as described in NetBSD Project, <http://www.NetBSD.org>,
was used to provide any needed kernel code mod-
ifications. The dispatching overhead associated with
both techniques is minimal because the packet dis- 
patching is based on simple IP address hashing, without 
the need for storing or searching any address-mapping 
information. In the routing-based dispatching technique, 
the additional routing step in the dispatcher 64 typically 
adds a delay of about 1 to 2 msecs to the TCP round- 
trip time of each incoming request packet. A study in W. 
R. Stevens, TCP/IP Illustrated, Volume 3, pp. 185-186,
which is incorporated by reference herein, indicates that
the median TCP round-trip time is 187 msecs. The
additional delay attributable to the routing-based
dispatching is therefore negligible. Although the
additional routing step for every request packet sent to
the ghost IP address may increase the traffic in the LAN of
the server cluster, the size of a request in many
important applications is typically much smaller than that
of the corresponding response, which is delivered directly
to the client without the additional routing. In the
broadcast-based dispatching technique, the broadcasting of
each incoming request packet on the LAN of the server
cluster does not substantially increase network traffic.
Although a hash value is computed for each incoming
packet having the ghost IP destination address, which
increases the CPU load of each server, this additional
computation overhead is negligible relative to the
corresponding communication delay.
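
The size of the added delay relative to the round-trip time follows from the two figures quoted above. The short Python calculation below uses only those figures (1 to 2 msecs of added delay and a median round-trip time of 187 msecs); the function name is hypothetical.

```python
MEDIAN_RTT_MS = 187.0  # median TCP round-trip time quoted in the text


def relative_overhead(extra_ms: float, rtt_ms: float = MEDIAN_RTT_MS) -> float:
    """Fraction of the round-trip time added by the extra routing step."""
    return extra_ms / rtt_ms


low = relative_overhead(1.0)   # lower bound of the quoted 1 to 2 msec range
high = relative_overhead(2.0)  # upper bound of the quoted range
print(f"added delay: {low:.1%} to {high:.1%} of the median round-trip time")
```

The added routing step therefore amounts to roughly 0.5% to 1.1% of the median round-trip time.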

Both the routing-based and broadcast-based
dispatching techniques of the present invention are
scalable to support relatively large numbers of servers.
Although the dispatcher in the routing-based technique
could present a potential bottleneck in certain
applications, a study in the above-cited D. M. Dias et al.
reference indicates that a single dispatcher can support up
to 75 server nodes, which is sufficient support for many
practical systems. The number of servers supported
may be even higher with the present invention given that
the routing-based dispatching functions described
herein are generally simpler than those in the D. M. Dias
et al. reference. It should also be noted that additional
scalability can be obtained by combining the routing-based
dispatching of the present invention with a DNS
round-robin technique. For example, a DNS server may be
used to map a domain name to one of a number of
different ghost IP addresses belonging to different server
clusters using a round-robin technique. In the
broadcast-based dispatching technique, there is no
potential dispatching bottleneck, although the device
drivers or other portions of the servers may need to be
modified to provide the above-described filtering routines.
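
The combination with DNS round-robin can be sketched in Python as follows. The resolver below is a hypothetical stand-in for a DNS server, and the ghost addresses used in the usage note are placeholders from documentation address ranges rather than values from the text.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin_resolver(ghost_addresses: List[str]) -> Iterator[str]:
    """Yield one ghost IP address per lookup, rotating over the clusters.

    Each address is assumed to be the cluster address of a different
    server cluster; dispatching inside a cluster is handled separately.
    """
    if not ghost_addresses:
        raise ValueError("at least one ghost IP address is required")
    return cycle(ghost_addresses)
```

For example, `resolver = round_robin_resolver(["198.51.100.1", "203.0.113.1"])` followed by repeated calls to `next(resolver)` alternates between the two clusters, and the hash-based dispatching within each cluster then selects the individual server.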

The routing-based and broadcast-based
dispatching of the present invention can also provide load
balancing and failure handling capabilities. For example,
given N servers and a packet from client address CA,
the above-described routing-based dispatching function
may compute a hash value k as CA mod (N-1) and select
server k to process the packet. More sophisticated
dispatching functions can also be used, and may involve
analyzing the actual service access log to provide more
effective load balancing. In order to detect failures, each
server may be monitored by a watchdog daemon such
as the watchd daemon described in greater detail in Y.
Huang and C. Kintala, "Software Implemented Fault
Tolerance: Technologies and Experience," Proceedings of
the 23rd International Symposium on Fault-Tolerant
Computing - FTCS, Toulouse, France, pp. 2-9, June
8. The method of claim 6 wherein the dispatching step
includes reapplying the hash function to the client
IP address to identify another server if a server
identified as a result of a previous application of
the hash function has failed.

9. The method of any of claims 1 to 5 wherein the 
processing step includes the steps of: 

routing client requests directed to the common 
address to a dispatcher connected to a local 
network associated with the plurality of servers; 
and 

selecting a particular one of the servers to proc- 
ess a given client request based on application 
of a hash function to a corresponding client ad- 
dress in the dispatcher. 

10. The method of any of claims 1 to 5 wherein the 
processing step includes the steps of: 

broadcasting a given client request directed to 
the common address to each of the plurality of 
servers over a local network associated with 
the servers; and 

implementing a filtering routine in each of the 
plurality of servers so that the given client re- 
quest is processed by only one of the servers. 

11. The method of claim 10 wherein the implementing 
step includes the steps of: 

applying a hash function to a client IP address 
associated with the given client request; and

comparing the result of the applying step to an
identifier of a particular server to determine 
whether that server should process the given 
client request. 

12. An apparatus for routing client requests to a plurality 
of servers configured to support a network service 
over a communication network, each of the servers 
having a primary address, the apparatus compris- 
ing: 

means for assigning a common address as a 
secondary address for each of the plurality of 
servers; and 

means for processing client requests directed 
to the common address such that each of the 
requests is processed by a particular one of the 
plurality of servers. 

13. The apparatus of claim 12 wherein the processing 
means is operative to dispatch a request of a given 
client to one of the plurality of servers based on ap- 
plication of a hash function to an IP address of the 
given client. 



14. The apparatus of claim 13 wherein the hash func- 
tion is determined based on an analysis of a distri- 
bution of client IP addresses in an access log asso- 
ciated with one or more of the servers. 

15. The apparatus of claim 13 wherein the processing
means is further operative to reapply the hash function
to the client IP address to identify another server
if a server identified as a result of a previous
application of the hash function has failed.

16. The apparatus of any of claims 12 to 15 wherein the
processing means further includes a dispatcher 
connected to a local network associated with the 
plurality of servers, wherein the dispatcher is oper- 
ative to receive client requests directed to the com- 
mon address, and to select a particular one of the 
servers to process a given client request based on 
application of a hash function to a corresponding 
client address. 

17. The apparatus of any of claims 12 to 15 wherein the
processing means further includes: 

means for broadcasting a given client request 
directed to the common address to each of the 
plurality of servers over a local network associated
with the servers; and

means for filtering the given client request in
each of the plurality of servers so that the given 
client request is processed by only one of the 
servers. 

18. The apparatus of claim 17 wherein the filtering
means is operative to apply a hash function to a cli- 
ent IP address associated with the given client re- 
quest, and to compare the result of the applying 
step to an identifier of a particular server to deter- 
mine whether that server should process the given 
client request. 

19. An apparatus for routing client requests for a net- 
work service over a communication network, the 
apparatus comprising: 

a plurality of servers configured to support the 
network service, each of the servers having a 
primary address and a secondary address, 
wherein a common address is assigned as the 
secondary address for each of the plurality of 
servers; and 

a router coupled to the servers and operative 
to route client requests directed to the common 
address such that each of the requests is proc- 
essed by a particular one of the plurality of serv- 
ers. 

20. The apparatus of claim 19 wherein the router is fur- 
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